Identifiability of latent class models with many observed variables

Authors

  • Elizabeth S. Allman
  • Catherine Matias
  • John A. Rhodes
Abstract

While latent class models of various types arise in many statistical applications, it is often difficult to establish their identifiability. Focusing on models in which there is some structure of independence of some of the observed variables conditioned on hidden ones, we demonstrate a general approach for establishing identifiability, utilizing algebraic arguments. A theorem of J. Kruskal for a simple latent class model with finite state space lies at the core of our results, though we apply it to a diverse set of models. These include mixtures of both finite and non-parametric product distributions, hidden Markov models, and random graph models, and lead to a number of new results and improvements to old ones. In the parametric setting we argue that the classical definition of identifiability is too strong, and should be replaced by the concept of generic identifiability. Generic identifiability implies that the set of non-identifiable parameters has zero measure, so that the model remains useful for inference. In particular, this sheds light on the properties of finite mixtures of Bernoulli products, which have been used for decades despite being known to be non-identifiable models. In the non-parametric setting, we again obtain identifiability only when certain restrictions are placed on the distributions that are mixed, but we explicitly describe the conditions.

AMS 2000 subject classifications: Primary 62E10; secondary 62F99, 62G99.

The authors thank the Isaac Newton Institute for the Mathematical Sciences and the Statistical and Applied Mathematical Sciences Institute for their support during residencies in which some of this work was undertaken. ESA and JAR also thank the National Science Foundation, for support from NSF grant DMS 0714830.

arXiv:0809.5032v1 [math.ST] 29 Sep 2008
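To make the role of Kruskal's theorem more concrete, here is a minimal Python sketch. It assumes the commonly stated form of Kruskal's condition for a latent class model with r hidden classes and three observed variables: if the Kruskal ranks I1, I2, I3 of the three class-conditional probability matrices satisfy I1 + I2 + I3 >= 2r + 2, the parameters are identifiable up to relabeling of the classes. The function names and the example matrices are hypothetical and do not come from the paper. The condition is sufficient, not necessary, so a `False` result does not by itself prove non-identifiability.

```python
from itertools import combinations

import numpy as np


def kruskal_rank(M, tol=1e-10):
    """Largest k such that every set of k rows of M is linearly independent.

    Rows of M are assumed to index hidden classes; columns index the
    states of one observed variable (an illustrative convention).
    """
    n_rows = M.shape[0]
    k = 0
    for size in range(1, n_rows + 1):
        if all(
            np.linalg.matrix_rank(M[list(rows)], tol=tol) == size
            for rows in combinations(range(n_rows), size)
        ):
            k = size
        else:
            break
    return k


def kruskal_condition_holds(matrices):
    """Check I1 + I2 + I3 >= 2r + 2 for three r-row matrices."""
    r = matrices[0].shape[0]
    return sum(kruskal_rank(M) for M in matrices) >= 2 * r + 2


# Hypothetical example: r = 2 classes, three binary observed variables.
M1 = np.array([[0.9, 0.1], [0.2, 0.8]])
M2 = np.array([[0.7, 0.3], [0.4, 0.6]])
M3 = np.array([[0.6, 0.4], [0.1, 0.9]])

# A degenerate choice: both classes give the same distribution for variable 3.
M3_degenerate = np.array([[0.5, 0.5], [0.5, 0.5]])

generic_ok = kruskal_condition_holds([M1, M2, M3])
degenerate_ok = kruskal_condition_holds([M1, M2, M3_degenerate])
```

In this sketch the first parameter choice meets the condition, while the degenerate one does not. This illustrates the idea of generic identifiability: the condition can fail on a small exceptional set of parameters while holding elsewhere.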

Similar resources

Identifiability of Parameters in Latent Structure Models with Many Observed Variables

While hidden class models of various types arise in many statistical applications, it is often difficult to establish the identifiability of their parameters. Focusing on models in which there is some structure of independence of some of the observed variables conditioned on hidden ones, we demonstrate a general approach for establishing identifiability utilizing algebraic arguments. A theorem ...

Noisy-OR Models with Latent Confounding

Given a set of experiments in which varying subsets of observed variables are subject to intervention, we consider the problem of identifiability of causal models exhibiting latent confounding. While identifiability is trivial when each experiment intervenes on a large number of variables, the situation is more complicated when only one or a few variables are subject to intervention per experim...

Identifiability of extended latent class models with individual covariates

Identifiability for a very flexible family of latent class models introduced recently is examined. These models allow for a conditional association between selected pairs of response variables conditionally on the latent and are based on logistic regression models both for the latent weights and for the conditional distributions of the response variables in terms of subject specific covariates....

Diagonal Orthant Multinomial Probit Models

Bayesian classification commonly relies on probit models, with data augmentation algorithms used for posterior computation. By imputing latent Gaussian variables, one can often trivially adapt computational approaches used in Gaussian models. However, MCMC for multinomial probit (MNP) models can be inefficient in practice due to high posterior dependence between latent variables and parameters,...

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabular...

Publication date: 2008